Research Paper: Semi-automated Construction of Decision Rules to Predict Morbidities from Clinical Texts

نویسندگان

  • Richárd Farkas
  • György Szarvas
  • István Hegedüs
  • Attila Almási
  • Veronika Vincze
  • Róbert Ormándi
  • Róbert Busa-Fekete
چکیده

OBJECTIVE In this study the authors describe the system submitted by the team of University of Szeged to the second i2b2 Challenge in Natural Language Processing for Clinical Data. The challenge focused on the development of automatic systems that analyzed clinical discharge summary texts and addressed the following question: "Who's obese and what co-morbidities do they (definitely/most likely) have?". Target diseases included obesity and its 15 most frequent comorbidities exhibited by patients, while the target labels corresponded to expert judgments based on textual evidence and intuition (separately). DESIGN The authors applied statistical methods to preselect the most common and confident terms and evaluated outlier documents by hand to discover infrequent spelling variants. The authors expected a system with dictionaries gathered semi-automatically to have a good performance with moderate development costs (the authors examined just a small proportion of the records manually). MEASUREMENTS Following the standard evaluation method of the second Workshop on challenges in Natural Language Processing for Clinical Data, the authors used both macro- and microaveraged Fbeta=1 measure for evaluation. RESULTS The authors submission achieved a microaverage F(beta=1) score of 97.29% for classification based on textual evidence (macroaverage F(beta=1) = 76.22%) and 96.42% for intuitive judgments (macroaverage F(beta=1) = 67.27%). CONCLUSIONS The results demonstrate the feasibility of the authors approach and show that even very simple systems with a shallow linguistic analysis can achieve remarkable accuracy scores for classifying clinical records on a limited set of concepts.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rough sets theory in site selection decision making for water reservoirs

Rough Sets theory is a mathematical approach for analysis of a vague description of objects presented by a well-known mathematician, Pawlak (1982, 1991). This paper explores the use of Rough Sets theory in site location investigation of buried concrete water reservoirs. Making an appropriate decision in site location can always avoid unnecessary expensive costs which is very important in constr...

متن کامل

Cost Function Modelling for Semi-automated SC, RTG and Automated and Semi-automated RMG Container Yard Operating Systems

This study analyses the concept of cost functions for semi-automated Straddle Carrier (SC), Rubber Tyred Gantry (RTG) and automated Rail Mounted Gantry (RMG) container yard operating cranes. It develops a generic cost based model for a pair-wise comparison, analysis and evaluation of economic efficiency and effectiveness of container yard equipment to be used for decision-making by terminal pla...

متن کامل

Automated Detection of Financial Events

Today’s financial markets are inextricably linked with financial events like acquisitions, profit announcements, or product launches. Information extracted from news messages that report on such events could hence be beneficial for financial decision making. The ubiquity of news, however, makes manual analysis impossible, and due to the unstruc tured nature of text, the (semi-)automatic extract...

متن کامل

Are Autonomous Mobile Robots Able to Take Over Construction? A Review

Although construction has been known as a highly complex application field for autonomous robotic systems, recent advances in this field offer great hope for using robotic capabilities to develop automated construction. Today, space research agencies seek to build infrastructures without human intervention, and construction companies look to robots with the potential to improve construction qua...

متن کامل

A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company

Based on the findings of Massachusetts Institute of Technology, organizations’ data double every five years. However, the rate of using data is 0.3. Nowadays, data mining tools have greatly facilitated the process of knowledge extraction from a welter of data. This paper presents a hybrid model using data gathered from an ATM manufacturing company. The steps of the research are based on CRISP-D...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of the American Medical Informatics Association : JAMIA

دوره 16 4  شماره 

صفحات  -

تاریخ انتشار 2009